Goto

Collaborating Authors

 sentiment and semantic analysis


An evaluation of LLMs and Google Translate for translation of selected Indian languages via sentiment and semantic analyses

arXiv.org Artificial Intelligence

Large Language models (LLMs) have been prominent for language translation, including low-resource languages. There has been limited study about the assessment of the quality of translations generated by LLMs, including Gemini, GPT and Google Translate. In this study, we address this limitation by using semantic and sentiment analysis of selected LLMs for Indian languages, including Sanskrit, Telugu and Hindi. We select prominent texts that have been well translated by experts and use LLMs to generate their translations to English, and then we provide a comparison with selected expert (human) translations. Our findings suggest that while LLMs have made significant progress in translation accuracy, challenges remain in preserving sentiment and semantic integrity, especially in figurative and philosophical contexts. The sentiment analysis revealed that GPT-4o and GPT-3.5 are better at preserving the sentiments for the Bhagavad Gita (Sanskrit-English) translations when compared to Google Translate. We observed a similar trend for the case of Tamas (Hindi-English) and Maha P (Telugu-English) translations. GPT-4o performs similarly to GPT-3.5 in the translation in terms of sentiments for the three languages. We found that LLMs are generally better at translation for capturing sentiments when compared to Google Translate.


Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis

arXiv.org Artificial Intelligence

Machine translation using large language models (LLMs) is having a significant global impact, making communication easier. Mandarin Chinese is the official language used for communication by the government and media in China. In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis. In order to demonstrate our framework, we select the classic early twentieth-century novel 'The True Story of Ah Q' with selected Mandarin Chinese to English translations. We use Google Translate to translate the given text into English and then conduct a chapter-wise sentiment analysis and semantic analysis to compare the extracted sentiments across the different translations. Our results indicate that the precision of Google Translate differs both in terms of semantic and sentiment analysis when compared to human expert translations. We find that Google Translate is unable to translate some of the specific words or phrases in Chinese, such as Chinese traditional allusions. The mistranslations may be due to lack of contextual significance and historical knowledge of China.


Evaluating authenticity and quality of image captions via sentiment and semantic analyses

arXiv.org Artificial Intelligence

The growth of deep learning (DL) relies heavily on huge amounts of labelled data for tasks such as natural language processing and computer vision. Specifically, in image-to-text or image-to-image pipelines, opinion (sentiment) may be inadvertently learned by a model from human-generated image captions. Additionally, learning may be affected by the variety and diversity of the provided captions. While labelling large datasets has largely relied on crowd-sourcing or data-worker pools, evaluating the quality of such training data is crucial. This study proposes an evaluation method focused on sentiment and semantic richness. That method was applied to the COCO-MS dataset, comprising approximately 150K images with segmented objects and corresponding crowd-sourced captions. We employed pre-trained models (Twitter-RoBERTa-base and BERT-base) to extract sentiment scores and variability of semantic embeddings from captions. The relation of the sentiment score and semantic variability with object categories was examined using multiple linear regression. Results indicate that while most captions were neutral, about 6% of the captions exhibited strong sentiment influenced by specific object categories. Semantic variability of within-image captions remained low and uncorrelated with object categories. Model-generated captions showed less than 1.5% of strong sentiment which was not influenced by object categories and did not correlate with the sentiment of the respective human-generated captions. This research demonstrates an approach to assess the quality of crowd- or worker-sourced captions informed by image content.